Overview

Dataset statistics

Number of variables: 10
Number of observations: 824
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 64.5 KiB
Average record size in memory: 80.2 B
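
As a quick consistency check, the average record size should equal the total size in memory divided by the number of observations. The sketch below assumes 1 KiB = 1024 bytes and uses the rounded figures reported above, so it can only reproduce the result to the displayed precision.

```python
# Figures copied from the dataset statistics above (rounded as reported).
total_size_kib = 64.5
n_observations = 824

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Average record size: {avg_record_bytes:.1f} B")
```

The result agrees with the reported 80.2 B.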

Variable types

Numeric: 10

Warnings

water is highly correlated with superplasticizer [High correlation]
superplasticizer is highly correlated with water [High correlation]
age is highly correlated with csMPa [High correlation]
csMPa is highly correlated with age [High correlation]
csMPa is highly correlated with Id and 2 other fields [High correlation]
fineaggregate is highly correlated with Id and 5 other fields [High correlation]
Id is highly correlated with csMPa and 7 other fields [High correlation]
water is highly correlated with fineaggregate and 5 other fields [High correlation]
coarseaggregate is highly correlated with fineaggregate and 6 other fields [High correlation]
slag is highly correlated with fineaggregate and 5 other fields [High correlation]
cement is highly correlated with csMPa and 7 other fields [High correlation]
flyash is highly correlated with Id and 3 other fields [High correlation]
superplasticizer is highly correlated with csMPa and 7 other fields [High correlation]
Id has unique values [Unique]
slag has 377 (45.8%) zeros [Zeros]
flyash has 461 (55.9%) zeros [Zeros]
superplasticizer has 304 (36.9%) zeros [Zeros]
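
The zero percentages in these warnings are simply each zero count divided by the 824 observations. A minimal sketch, using only the counts reported in this report (the dictionary name is illustrative):

```python
# Zero counts copied from the warnings in this report.
n_observations = 824
zero_counts = {"slag": 377, "flyash": 461, "superplasticizer": 304}

# Percentage of observations that are exactly zero, rounded to one decimal.
zero_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_pct.items():
    print(f"{name}: {pct}% zeros")
```

The same division by 824 gives every percentage in the frequency tables of this report.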

Reproduction

Analysis started: 2021-11-23 13:19:51.511416
Analysis finished: 2021-11-23 13:20:04.659527
Duration: 13.15 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

Id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 824
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 513.8470874
Minimum: 0
Maximum: 1028
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 50.15
Q1: 251.75
Median: 513.5
Q3: 770.25
95-th percentile: 974.85
Maximum: 1028
Range: 1028
Interquartile range (IQR): 518.5

Descriptive statistics

Standard deviation: 296.7867789
Coefficient of variation (CV): 0.5775780115
Kurtosis: -1.20657835
Mean: 513.8470874
Median Absolute Deviation (MAD): 259.5
Skewness: 0.000627465727
Sum: 423410
Variance: 88082.39214
Monotonicity: Not monotonic
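
Several of these statistics are derived from the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the figures reported for Id. It works from rounded inputs, so agreement is expected only to the displayed precision.

```python
# Figures copied from the Id statistics above.
mean = 513.8470874
std = 296.7867789
q1, q3 = 251.75, 770.25
minimum, maximum = 0, 1028

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 6), round(variance, 2))
```

The same relations hold for the other nine variables in this report.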
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
0                       1           0.1%
693                     1           0.1%
678                     1           0.1%
680                     1           0.1%
681                     1           0.1%
683                     1           0.1%
684                     1           0.1%
686                     1           0.1%
688                     1           0.1%
690                     1           0.1%
Other values (814)    814          98.8%

Smallest values
Value               Count  Frequency (%)
0                       1           0.1%
1                       1           0.1%
4                       1           0.1%
5                       1           0.1%
6                       1           0.1%
7                       1           0.1%
8                       1           0.1%
9                       1           0.1%
11                      1           0.1%
12                      1           0.1%

Largest values
Value               Count  Frequency (%)
1028                    1           0.1%
1027                    1           0.1%
1026                    1           0.1%
1024                    1           0.1%
1023                    1           0.1%
1022                    1           0.1%
1021                    1           0.1%
1019                    1           0.1%
1018                    1           0.1%
1017                    1           0.1%

cement
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 254
Distinct (%): 30.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 283.360801
Minimum: 102
Maximum: 540
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 102
5-th percentile: 143.615
Q1: 192
Median: 275.1
Q3: 359.9
95-th percentile: 491
Maximum: 540
Range: 438
Interquartile range (IQR): 167.9

Descriptive statistics

Standard deviation: 107.5364039
Coefficient of variation (CV): 0.3795034581
Kurtosis: -0.6077758768
Mean: 283.360801
Median Absolute Deviation (MAD): 83.9
Skewness: 0.4933427896
Sum: 233489.3
Variance: 11564.07816
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
425                    17           2.1%
362.6                  16           1.9%
475                    13           1.6%
251.4                  13           1.6%
310                    13           1.6%
250                    12           1.5%
349                    12           1.5%
446                    11           1.3%
236                    10           1.2%
331                    10           1.2%
Other values (244)    697          84.6%

Smallest values
Value               Count  Frequency (%)
102                     4           0.5%
108.3                   4           0.5%
116                     3           0.4%
122.6                   4           0.5%
132                     2           0.2%
133                     3           0.4%
133.1                   1           0.1%
134.7                   1           0.1%
135                     1           0.1%
135.7                   2           0.2%

Largest values
Value               Count  Frequency (%)
540                     7           0.8%
531.3                   5           0.6%
528                     1           0.1%
525                     7           0.8%
522                     2           0.2%
520                     2           0.2%
516                     2           0.2%
500.1                   1           0.1%
500                    10           1.2%
491                     7           0.8%

slag
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 166
Distinct (%): 20.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.37160194
Minimum: 0
Maximum: 359.4
Zeros: 377
Zeros (%): 45.8%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 22
Q3: 144.775
95-th percentile: 236
Maximum: 359.4
Range: 359.4
Interquartile range (IQR): 144.775

Descriptive statistics

Standard deviation: 86.97778445
Coefficient of variation (CV): 1.169502635
Kurtosis: -0.5182968517
Mean: 74.37160194
Median Absolute Deviation (MAD): 22
Skewness: 0.8020652798
Sum: 61282.2
Variance: 7565.134988
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
0                     377          45.8%
189                    24           2.9%
106.3                  17           2.1%
24                     11           1.3%
20                      9           1.1%
98.1                    9           1.1%
19                      8           1.0%
145                     8           1.0%
26                      7           0.8%
116                     6           0.7%
Other values (156)    348          42.2%

Smallest values
Value               Count  Frequency (%)
0                     377          45.8%
11                      4           0.5%
13.6                    2           0.2%
15                      5           0.6%
17.2                    1           0.1%
17.5                    1           0.1%
17.6                    1           0.1%
19                      8           1.0%
20                      9           1.1%
22                      6           0.7%

Largest values
Value               Count  Frequency (%)
359.4                   2           0.2%
342.1                   1           0.1%
316.1                   2           0.2%
305.3                   3           0.4%
290.2                   2           0.2%
288                     4           0.5%
282.8                   3           0.4%
272.8                   1           0.1%
262.2                   5           0.6%
260                     1           0.1%

flyash
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 130
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.16080097
Minimum: 0
Maximum: 195
Zeros: 461
Zeros (%): 55.9%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 118.3
95-th percentile: 166.85
Maximum: 195
Range: 195
Interquartile range (IQR): 118.3

Descriptive statistics

Standard deviation: 64.0006463
Coefficient of variation (CV): 1.203906734
Kurtosis: -1.321913729
Mean: 53.16080097
Median Absolute Deviation (MAD): 0
Skewness: 0.5660377685
Sum: 43804.5
Variance: 4096.082726
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
0                     461          55.9%
118.3                  15           1.8%
141                    14           1.7%
24.5                   13           1.6%
79                     11           1.3%
121.6                   9           1.1%
94                      9           1.1%
95.7                    9           1.1%
100.4                   8           1.0%
167                     8           1.0%
Other values (120)    267          32.4%

Smallest values
Value               Count  Frequency (%)
0                     461          55.9%
24.5                   13           1.6%
59                      1           0.1%
71                      1           0.1%
71.5                    1           0.1%
76                      1           0.1%
77                      2           0.2%
78                      2           0.2%
78.4                    1           0.1%
79                     11           1.3%

Largest values
Value               Count  Frequency (%)
195                     3           0.4%
194.9                   1           0.1%
194                     1           0.1%
190                     1           0.1%
185.3                   1           0.1%
185                     2           0.2%
184                     1           0.1%
183.9                   1           0.1%
182.1                   1           0.1%
182                     1           0.1%

water
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 179
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 181.7970874
Minimum: 121.8
Maximum: 247
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 121.8
5-th percentile: 146.13
Q1: 164.9
Median: 185.35
Q3: 192
95-th percentile: 228
Maximum: 247
Range: 125.2
Interquartile range (IQR): 27.1

Descriptive statistics

Standard deviation: 21.32190452
Coefficient of variation (CV): 0.1172840821
Kurtosis: 0.1765762128
Mean: 181.7970874
Median Absolute Deviation (MAD): 13
Skewness: 0.09197296622
Sum: 149800.8
Variance: 454.6236124
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
192                    97          11.8%
228                    43           5.2%
185.7                  36           4.4%
203.5                  30           3.6%
186                    25           3.0%
162                    17           2.1%
164.9                  16           1.9%
153.5                  13           1.6%
200                    12           1.5%
193                    11           1.3%
Other values (169)    524          63.6%

Smallest values
Value               Count  Frequency (%)
121.8                   5           0.6%
126.6                   4           0.5%
137.8                   3           0.4%
140                     1           0.1%
140.8                   4           0.5%
141.8                   5           0.6%
142                     1           0.1%
143.3                   2           0.2%
144.7                   4           0.5%
145                     3           0.4%

Largest values
Value               Count  Frequency (%)
247                     1           0.1%
246.9                   1           0.1%
237                     1           0.1%
236.7                   1           0.1%
228                    43           5.2%
221.4                   1           0.1%
221                     1           0.1%
220.1                   1           0.1%
220                     2           0.2%
219.7                   1           0.1%

superplasticizer
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 105
Distinct (%): 12.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.163956311
Minimum: 0
Maximum: 32.2
Zeros: 304
Zeros (%): 36.9%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 6.1
Q3: 10.125
95-th percentile: 16.085
Maximum: 32.2
Range: 32.2
Interquartile range (IQR): 10.125

Descriptive statistics

Standard deviation: 5.967257716
Coefficient of variation (CV): 0.9680889051
Kurtosis: 1.265712251
Mean: 6.163956311
Median Absolute Deviation (MAD): 5.6
Skewness: 0.8977497366
Sum: 5079.1
Variance: 35.60816464
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
0                     304          36.9%
11.6                   28           3.4%
8                      17           2.1%
7                      14           1.7%
9.9                    13           1.6%
7.8                    13           1.6%
16.5                   13           1.6%
9                      13           1.6%
6                      12           1.5%
11                     12           1.5%
Other values (95)     385          46.7%

Smallest values
Value               Count  Frequency (%)
0                     304          36.9%
1.7                     4           0.5%
1.9                     1           0.1%
2                       1           0.1%
2.5                     2           0.2%
3                       6           0.7%
3.1                     1           0.1%
3.4                     3           0.4%
3.6                     5           0.6%
3.9                     7           0.8%

Largest values
Value               Count  Frequency (%)
32.2                    3           0.4%
28.2                    5           0.6%
23.4                    4           0.5%
22.1                    1           0.1%
22                      5           0.6%
20.8                    1           0.1%
19                      1           0.1%
18.6                    4           0.5%
18.3                    1           0.1%
18                      1           0.1%

coarseaggregate
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 258
Distinct (%): 31.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 973.5485437
Minimum: 801
Maximum: 1145
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 801
5-th percentile: 842
Q1: 932
Median: 968
Q3: 1040.6
95-th percentile: 1104.51
Maximum: 1145
Range: 344
Interquartile range (IQR): 108.6

Descriptive statistics

Standard deviation: 78.69463012
Coefficient of variation (CV): 0.08083277473
Kurtosis: -0.6438676258
Mean: 973.5485437
Median Absolute Deviation (MAD): 52.5
Skewness: -0.04148492161
Sum: 802204
Variance: 6192.84481
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
932                    46           5.6%
852.1                  39           4.7%
944.7                  24           2.9%
1125                   21           2.5%
968                    20           2.4%
967                    16           1.9%
1047                   15           1.8%
942                    10           1.2%
822                     9           1.1%
938                     9           1.1%
Other values (248)    615          74.6%

Smallest values
Value               Count  Frequency (%)
801                     4           0.5%
801.4                   1           0.1%
811                     1           0.1%
814                     1           0.1%
814.1                   1           0.1%
817.9                   1           0.1%
819                     1           0.1%
819.2                   1           0.1%
820                     1           0.1%
822                     9           1.1%

Largest values
Value               Count  Frequency (%)
1145                    1           0.1%
1134.3                  5           0.6%
1130                    1           0.1%
1125                   21           2.5%
1124.4                  2           0.2%
1120                    1           0.1%
1119                    2           0.2%
1118.8                  1           0.1%
1118                    1           0.1%
1113                    1           0.1%

fineaggregate
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 274
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 772.1074029
Minimum: 594
Maximum: 992.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 594
5-th percentile: 613
Q1: 726.775
Median: 778.5
Q3: 821.25
95-th percentile: 895.895
Maximum: 992.6
Range: 398.6
Interquartile range (IQR): 94.475

Descriptive statistics

Standard deviation: 80.98471665
Coefficient of variation (CV): 0.1048878904
Kurtosis: -0.1344581102
Mean: 772.1074029
Median Absolute Deviation (MAD): 45.5
Skewness: -0.2399879142
Sum: 636216.5
Variance: 6558.524332
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
755.8                  24           2.9%
594                    24           2.9%
613                    20           2.4%
670                    18           2.2%
801                    14           1.7%
746.6                  13           1.6%
887.1                  13           1.6%
712                    11           1.3%
845                    10           1.2%
780.1                  10           1.2%
Other values (264)    667          80.9%

Smallest values
Value               Count  Frequency (%)
594                    24           2.9%
605                     5           0.6%
611.8                   5           0.6%
612                     1           0.1%
613                    20           2.4%
613.2                   2           0.2%
623                     2           0.2%
630                     3           0.4%
631                     4           0.5%
633                     2           0.2%

Largest values
Value               Count  Frequency (%)
992.6                   4           0.5%
945                     2           0.2%
943.1                   4           0.5%
942                     4           0.5%
925.7                   4           0.5%
905.9                   4           0.5%
903.8                   3           0.4%
903.6                   4           0.5%
901.8                   4           0.5%
900.9                   4           0.5%

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 14
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.66140777
Minimum: 1
Maximum: 365
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 12.25
Median: 28
Q3: 56
95-th percentile: 180
Maximum: 365
Range: 364
Interquartile range (IQR): 43.75

Descriptive statistics

Standard deviation: 60.47570164
Coefficient of variation (CV): 1.354093045
Kurtosis: 13.07549467
Mean: 44.66140777
Median Absolute Deviation (MAD): 21
Skewness: 3.33541121
Sum: 36801
Variance: 3657.310489
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)

Most frequent values
Value               Count  Frequency (%)
28                    350          42.5%
3                     110          13.3%
7                      94          11.4%
56                     72           8.7%
14                     46           5.6%
90                     44           5.3%
100                    39           4.7%
180                    21           2.5%
91                     20           2.4%
270                     9           1.1%
Other values (4)       19           2.3%

Smallest values
Value               Count  Frequency (%)
1                       2           0.2%
3                     110          13.3%
7                      94          11.4%
14                     46           5.6%
28                    350          42.5%
56                     72           8.7%
90                     44           5.3%
91                     20           2.4%
100                    39           4.7%
120                     3           0.4%

Largest values
Value               Count  Frequency (%)
365                     9           1.1%
360                     5           0.6%
270                     9           1.1%
180                    21           2.5%
120                     3           0.4%
100                    39           4.7%
91                     20           2.4%
90                     44           5.3%
56                     72           8.7%
28                    350          42.5%

csMPa
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 701
Distinct (%): 85.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.85786408
Minimum: 2.33
Maximum: 82.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 2.33
5-th percentile: 11.3645
Q1: 23.685
Median: 34.08
Q3: 45.8625
95-th percentile: 67.28
Maximum: 82.6
Range: 80.27
Interquartile range (IQR): 22.1775

Descriptive statistics

Standard deviation: 16.86509934
Coefficient of variation (CV): 0.4703319557
Kurtosis: -0.2738606121
Mean: 35.85786408
Median Absolute Deviation (MAD): 10.81
Skewness: 0.4619332841
Sum: 29546.88
Variance: 284.4315757
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
33.4                    4           0.5%
23.52                   4           0.5%
77.3                    4           0.5%
71.3                    4           0.5%
79.3                    4           0.5%
39.3                    3           0.4%
31.35                   3           0.4%
43.7                    3           0.4%
64.3                    3           0.4%
17.54                   3           0.4%
Other values (691)    789          95.8%

Smallest values
Value               Count  Frequency (%)
2.33                    1           0.1%
3.32                    1           0.1%
4.57                    1           0.1%
4.78                    1           0.1%
4.9                     1           0.1%
6.27                    1           0.1%
6.47                    1           0.1%
6.81                    1           0.1%
6.94                    1           0.1%
7.32                    1           0.1%

Largest values
Value               Count  Frequency (%)
82.6                    1           0.1%
81.75                   1           0.1%
80.2                    1           0.1%
79.99                   1           0.1%
79.4                    1           0.1%
79.3                    4           0.5%
78.8                    1           0.1%
77.3                    4           0.5%
76.8                    1           0.1%
76.24                   1           0.1%

Interactions

[Figures: 100 pairwise interaction plots, one for each ordered pair of the 10 variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the slope of the line affects only the sign of r, not its magnitude.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
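
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code pandas-profiling itself runs. The function name `pearson_r` and the toy data are assumptions introduced for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products; the 1/n factors of the covariance and the
    # standard deviations cancel, so they are omitted.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant input has zero variance and raises ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical, not taken from the dataset).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
print(pearson_r(xs, ys))

# Shifting and positively rescaling one variable leaves r unchanged.
print(pearson_r([10 * v + 3 for v in xs], ys))
```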

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
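
A minimal sketch of that recipe: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Ties are given the average of the positions they occupy, which is a common convention but an assumption here. The helper names and the toy data are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# A nonlinear but strictly increasing relationship (toy data).
xs = [1, 2, 3, 4, 5]
ys = [v ** 3 for v in xs]
print(spearman_rho(xs, ys))
```

Because the cubic relationship is strictly increasing, the ranks of both variables coincide and ρ reaches its maximum even though the relationship is not linear.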

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
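
The pair-counting definition translates directly into code. The sketch below implements exactly the formula stated here, often called tau-a, with tied pairs counted as neither concordant nor discordant. Statistical libraries commonly report a tie-adjusted variant instead, so values may differ when the data contain ties. The function name `kendall_tau_a` is an assumption for the example.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y: counted in neither group
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data: two concordant pairs and one discordant pair out of three.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This direct version compares every pair and therefore runs in quadratic time, which is fine for illustration but slow for large datasets.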

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

     Id  cement   slag  flyash  water  superplasticizer  coarseaggregate  fineaggregate  age  csMPa
0   995   158.6  148.9   116.0  175.1              15.0            953.3          719.7   28  27.68
1   507   424.0   22.0   132.0  178.0               8.5            822.0          750.0   28  62.05
2   334   275.1    0.0   121.4  159.5               9.9           1053.6          777.5    3  23.80
3   848   252.0   97.0    76.0  194.0               8.0            835.0          821.0   28  33.40
4   294   168.9   42.2   124.3  158.3              10.8           1080.8          796.2    3   7.40
5   286   181.4    0.0   167.0  169.6               7.6           1055.6          777.8   28  27.77
6   938   154.8  183.4     0.0  193.3               9.1           1047.4          696.7   28  18.29
7   447   178.0  129.8   118.6  179.9               3.6           1007.3          746.8   56  48.59
8   692   212.0  141.3     0.0  203.5               0.0            973.4          750.0   90  39.70
9   652   102.0  153.0     0.0  192.0               0.0            887.0          942.0    3   4.57

Last rows

      Id  cement   slag  flyash  water  superplasticizer  coarseaggregate  fineaggregate  age  csMPa
814  308   277.1    0.0    97.4  160.6              11.8            973.9          875.6  100  55.64
815  661   141.3  212.0     0.0  203.5               0.0            971.8          748.5    7  10.39
816  130   323.7  282.8     0.0  183.8              10.3            942.7          659.9   28  74.70
817  663   133.0  200.0     0.0  192.0               0.0            927.4          839.2   28  27.87
818  871   159.0  187.0     0.0  176.0              11.0            990.0          789.0   28  32.76
819   87   286.3  200.9     0.0  144.7              11.2           1004.6          803.7    3  24.40
820  330   246.8    0.0   125.1  143.3              12.0           1086.8          800.9   14  42.22
821  466   190.3    0.0   125.2  166.6               9.9           1079.0          798.9  100  33.56
822  121   475.0  118.8     0.0  181.1               8.9            852.1          781.5   28  68.30
823  860   314.0    0.0   113.0  170.0              10.0            925.0          783.0   28  38.46